-
Notifications
You must be signed in to change notification settings - Fork 43
Migrate away from initContainers for CA Bundle operations #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Doug Edgar <[email protected]>
Signed-off-by: Doug Edgar <[email protected]>
ca9bb3a to
ab2c259
Compare
| } | ||
|
|
||
| func (r *LlamaStackDistributionReconciler) processODHConfigMapKeys(configMap *corev1.ConfigMap, keys []string, collector *certificateCollector) error { | ||
| for _, key := range keys { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to explicitly have odh specific function here instead of just using the above function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function was intended to be different from processConfigMapKeys in a couple ways.
For processConfigMapKeys, if the user specifies a ConfigMap that doesn't exist or a Key that has formatting issues, the deployment should fail until those issues are resolved by the user.
If the ODH bundle has some kind of issue, it may not be something that can be resolved by the user if they don't also manage the ODH instance. So the processODHConfigMapKeys function only warns them about it and then continues.
|
|
||
| // Skip scaling test if PVC with ReadWriteOnce is used | ||
| // Most cloud providers (AWS EBS, Azure Disk, GCE PD) only support ReadWriteOnce for block storage | ||
| // ReadWriteMany requires network file systems (NFS, CephFS, etc.) which may not be available |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change? Maybe be remove it or add it as a separate commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, this one was added to avoid errors encountered during testing. Now that the initContainer has been removed, the scale test is trying to attach the PVC to two separate containers at the same time, instead of after the initContainer finishes starting up.
If this PR merges, I'm planning on making a follow-up spike ticket to see if this race condition can be worked around somehow in a future commit.
| "VLLM_TLS_VERIFY": controllers.CABundleMountPath, | ||
| "SSL_CERT_FILE": controllers.ManagedCABundleFilePath, | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the tests, we should check for pod status as well https://github.com/llamastack/llama-stack-k8s-operator/pull/177/files#diff-eba972a48e969e438400b46aa87a98225eb27445bbe5d7f2c2fbb2fbdcd25b45R533
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I've updated this PR with a new commit to check for pods in the Ready status.
Signed-off-by: Doug Edgar <[email protected]>
ab2c259 to
301d72a
Compare
To provide better support for disconnected environments, the operator's initContainers are being removed.
This PR removes the initContainer that performs CA Bundle concatenation, and now processes cert data from ConfigMaps directly before mounting the final result into the llama-stack container.
Validation steps are performed to ensure that only legitimate certificate data is being passed to the llama-stack process in the container. Other basic size /number validation is done to ensure the operator stays within its resource limits.
Startup script additions were avoided this time, so that intentions like RHAIENG-1438 can be carried out without additional logic overhead.
Supersedes #174
Closes RHAIENG-1383